Requirements for Science Data Bases and SciDB
Abstract
For the past year, we have been assembling requirements from a collection of scientific data base users from astronomy, particle physics, fusion, remote sensing, oceanography, and biology. The intent has been to specify a common set of requirements for a new science data base system, which we call SciDB. In addition, we have discovered that very complex business analytics share most of the same requirements as “big science”. We have also constructed a partnership of companies to fund the development of SciDB, including eBay, the Large Synoptic Survey Telescope (LSST), Microsoft, the Stanford Linear Accelerator Center (SLAC) and Vertica. Lastly, we have identified two “lighthouse customers” (LSST and eBay) who will run the initial system, once it is constructed. In this paper, we report on the requirements we have identified and briefly sketch some of the SciDB design.
Similar resources
A Demonstration of SciDB: A Science-Oriented DBMS
In CIDR 2009, we presented a collection of requirements for SciDB, a DBMS that would meet the needs of scientific users. These included a nested-array data model, science-specific operations such as regrid, and support for uncertainty, lineage, and named versions. In this paper, we present an overview of SciDB’s key features and outline a demonstration of the first version of SciDB on data and o...
Report from the SciDB Workshop
A mini-workshop with representatives from the data-driven science and database research communities was organized in response to suggestions at the first XLDB Workshop. The goal was to develop common requirements and primitives for a next-generation database management system that scientists would use, including those from high-energy physics, astronomy, biology, geoscience and fusion, in order...
Performance Comparison of Big-Data Technologies in Locating Intersections in Satellite Ground Tracks
The performance and ease of extensibility for two Big-Data technologies, SciDB and Hadoop/MapReduce (HD/MR), are evaluated on identical hardware for an Earth science use case of locating intersections between two NASA remote sensing satellites’ ground tracks. SciDB is found to be 1.5 to 2.5 times faster than HD/MR. The performance of HD/MR approaches that of SciDB as the data size or the cluste...
SciDB DBMS Research at M.I.T.
This paper presents a snapshot of some of our scientific DBMS research at M.I.T. as part of the Intel Science and Technology Center on Big Data. We focus our efforts primarily on SciDB, although some of our work can be used for any backend DBMS. We summarize our work on making SciDB elastic, providing skew-aware join strategies, and producing scalable visualizations of scientific data.
Report from the 2nd Workshop on Extremely Large Databases
The complexity and sophistication of large scale analytics in science and industry have advanced dramatically in recent years. Analysts are struggling to use complex techniques such as time series analysis and classification algorithms because their familiar, powerful tools are not scalable and cannot effectively use scalable database systems. The 2nd Extremely Large Databases (XLDB) workshop was...